Regular Article

Psychotherapy and Psychosomatics

Psychother Psychosom 2014;83:158-164
DOI: 10.1159/000356191

Received: May 31, 2013
Accepted after revision: September 30, 2013
Published online: April 12, 2014



The Efficacy of Antidepressants on Overall 
Well-Being and Self-Reported Depression
Symptom Severity in Youth: A Meta-Analysis 

Glen I. Spielmans a, b   Katherine Gerwig a

a Department of Psychology, Metropolitan State University, Saint Paul, Minn., and b Department of Counseling Psychology, 
University of Wisconsin, Madison, Wisc., USA



Key Words 

Antidepressant • Meta-analysis • Children • Adolescents • 
Depression • Quality of life • Self- and clinician ratings •
Well-being 



Abstract

Background: Recent meta-analyses of the efficacy of second-generation
antidepressants for youth have concluded that such drugs possess a
statistically significant advantage over placebo in terms of
clinician-rated depressive symptoms. However, no meta-analysis has
included measures of quality of life, global mental health,
self-esteem, or autonomy. Further, prior meta-analyses have not
included self-reports of depressive symptoms. Methods: Studies were
selected through searching Medline, PsycINFO, and the Cochrane Central
Register for Controlled Trials databases as well as GlaxoSmithKline's
online trial registry. We included self-reports of depressive symptoms
and pooled measures of quality of life, global mental health,
self-esteem, and autonomous functioning as a proxy for overall
well-being. Results: We found a nonsignificant difference between
second-generation antidepressants and placebo in terms of
self-reported depressive symptoms (k = 6 trials, g = 0.06, p = 0.36).
Further, pooled across measures of quality of life, global mental
health, self-esteem, and autonomy, antidepressants yielded no
significant advantage over placebo (k = 3 trials, g = 0.11, p = 0.13).
Discussion: Though limited by a small number of trials, our analyses
suggest that antidepressants offer little to no benefit in improving
overall well-being among depressed children and adolescents.



The efficacy of second-generation antidepressants (including
selective serotonin reuptake inhibitors, serotonin/norepinephrine
reuptake inhibitors, bupropion, and nefazodone) in treating
depression in children and adolescents has been examined in many
placebo-controlled trials.
Some meta-analyses found little to no evidence of efficacy 
for antidepressants in treating child and adolescent de- 
pression [1, 2]. However, more recent meta-analyses 
based on a larger sample of trials reported a small and 
statistically significant benefit [3-5]. The most widely cited
meta-analysis on youth antidepressant efficacy concluded that the
benefits from second-generation antidepressants on clinician-rated
measures of depressive
symptoms 'appear to be much greater' than suicidality- 
related risks for treating depression in youth [4]. Prior 
meta-analyses have relied on clinician-rated measures of 
depression, with the preferred measure being the Chil- 
dren's Depression Rating Scale (CDRS) [6]. However, 
several other measures were used in these trials, but not 
included in subsequent meta-analyses. 

Measures of social functioning and health-related
quality of life are useful guides in determining the effectiveness
of antidepressants; further, calls have been made
for the regular inclusion of such measures in clinical trials 
of antidepressants [7-9]. If a treatment improves overall 
well-being, benefits should be apparent on rating scales 
measuring domains such as social functioning and qual- 
ity of life [10]. Symptom-based measures such as the 
CDRS are designed to be sensitive to the presence and 
severity of symptoms but not necessarily other compo- 
nents or important correlates of well-being, such as glob- 
al mental health, quality of life, self-care, self-esteem, or 
social functioning. Although measures of such constructs 
were included in several youth antidepressant trials, they 
have received very little attention in subsequent review 
articles. Thus, the extent to which antidepressants im- 
prove functioning and quality of life among depressed 
youth is essentially unknown, representing a major gap in 
understanding their true risk-benefit ratio. 

Some studies have found a weak relationship between 
depressive symptoms and functioning and that function- 
ing improves more slowly than symptoms [11]. However, 
4 of 5 recent acute-phase antidepressant studies in adults 
found that the effect size difference favoring antidepres- 
sants over placebo was quite similar on (a) measures of 
depressive symptoms and (b) measures of functional im- 
pairment or quality of life [12-16]. Thus, acute-phase tri- 
als appear capable of detecting differences between anti- 
depressants and placebo in terms of functioning and 
quality of life. 

Further, self-reports of depressive symptom severity 
have frequently been utilized in youth antidepressant tri- 
als, yet these measures have not been included in any 
quantitative synthesis of the literature. Self-reports are a 
valuable yet underutilized component of assessing out- 
comes in adult antidepressant treatment [17], and self- 
report symptom measures should serve a valuable role in 
assessing outcomes of youth antidepressant trials as well. 
Thus, the purpose of this meta-analysis was to system- 
atically examine the effects of antidepressants versus pla- 
cebo on self-report depression measures as well as mea- 
sures assessing quality of life, functioning, and overall 
well-being. 



Methods 

Search Strategy 

To identify studies for use in the meta-analysis, we searched 
Medline, PsycINFO, and the Cochrane Central Register for Con- 
trolled Trials using the terms depress* and each of the following 
drug names individually in separate searches: bupropion, citalopram,
duloxetine, escitalopram, fluvoxamine, fluoxetine, milnacipran,
mirtazapine, nefazodone, paroxetine, sertraline, venlafaxine
and (placebo in all text) and (child* OR adolescen*). In Medline, we
limited the search to the publication types: clinical trial, controlled 
clinical trial, and randomized controlled trial. In PsycINFO, we 
limited the methodology to treatment outcomes/clinical trials. We 
also searched GlaxoSmithKline's online clinical study registry to 
identify possible studies for inclusion. The searches were conducted 
in October 2012. We also examined all references contained in each 
study meeting our inclusion criteria as well as three previously pub- 
lished meta-analyses for studies that met our criteria [2-4]. 
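
The one-search-per-drug strategy described above can be illustrated with a short sketch. The Boolean syntax below is an assumption made for illustration only; the exact query strings and database-specific field tags used by the authors are not reported, and `build_queries` is a hypothetical helper name.

```python
# Drug names exactly as listed in the search strategy.
DRUGS = [
    "bupropion", "citalopram", "duloxetine", "escitalopram",
    "fluvoxamine", "fluoxetine", "milnacipran", "mirtazapine",
    "nefazodone", "paroxetine", "sertraline", "venlafaxine",
]


def build_queries(drugs=DRUGS):
    """Return one query string per drug name.

    The AND/OR syntax is a generic assumption, not the syntax of any
    particular database interface.
    """
    template = "depress* AND {drug} AND placebo AND (child* OR adolescen*)"
    return [template.format(drug=drug) for drug in drugs]


queries = build_queries()
for query in queries:
    print(query)
```

Each string would then be run as a separate search, with the publication-type or methodology limits applied in the database interface itself.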

Study Selection 

All included studies were acute-phase, placebo-controlled trials in
which participants were randomly assigned to groups. We excluded
trials in which participants were given adjunctive or augmentation
treatments, trials that did not focus on the treatment of depression,
and trials in which participants were not diagnosed with
depression. To meet inclusion criteria, studies had to report scores
from at least one relevant measure among the following domains: 
self-report of depressive symptoms, or a measure of quality of life, 
functioning, global mental health, or well-being. Data in the in- 
cluded studies had to be reported in a manner that allowed for the 
calculation of an effect size. 

Data Extraction 

Both study authors coded descriptor data. Disagreements were 
resolved by consensus and comparison with the original study. 
Coders were not blinded to study results. 

Descriptor variables coded for each study were as follows: 

(1) drug dosage; if a flexible dose was used then the mean dose 
taken by participants was recorded; 

(2) flexible dose or fixed dosing regimen; 

(3) flexible dosage range; 

(4) trial duration in weeks beginning when the double-blind treat- 
ment was administered; 

(5) source of funding for the study; 

(6) clinician-rated measures which measured depression symptom
severity, quality of life, functioning, or overall well-being;

(7) self-report or parent report measures which measured depres- 
sion symptom severity, quality of life, functioning, global men- 
tal health, or overall well-being, and 

(8) all parent report measures used in the study. 

Outcome Measures 

We included self-reports and clinician-rated measures of de- 
pression, using these as continuous outcomes. We did not extract 
information on rates of treatment response/remission based on 
underlying continuous measures. The following clinician-rated 
depression measures were used: Children's Depression Rating 
Scale [6] and Kiddie Schedule for Affective Disorders and
Schizophrenia [18]. We did not use data from either the Hamilton
Depression Rating Scale [19] or the Montgomery-Åsberg Depression
Rating Scale [20] because these measures were designed for use in
adults and may not be ideal to assess depressive severity in youth.
Data reporting in one study was quite minimal [21]. For this study, 
we extracted mean change data on the Kiddie Schedule for Affec- 
tive Disorders and Schizophrenia from a figure in the paper, in 
which it was clear that the mean difference between groups was 
zero. We estimated data on standard deviation units in this trial
from other trials in our meta-analysis that used the same measure,
weighting for the sample size in the two trials [22, 23]. For the Beck
Depression Inventory (BDI), we calculated the standard deviation
in von Knorring et al. [21] as the sample size-weighted standard
deviation from Berard et al. [22] and Emslie et al. [58]. For the
Kiddie Schedule for Affective Disorders and Schizophrenia, we
calculated the standard deviation in von Knorring et al. [21] as the
sample size-weighted standard deviation from Berard et al. [22] and
Keller et al. [23].

Table 1. Characteristics of included studies

Study                       Sample size     Drug dosage, mg               Flexible dose   Flexible dose  Trial duration,  Funding
                                                                          vs. fixed dose  range, mg      weeks            source

Citalopram
  von Knorring et al. [21]  drug = 103      26                            flexible        10-40          12               sponsor
                            placebo = 95

Fluoxetine
  Emslie et al. [57]        drug = 48       20                            fixed                          8                government
                            placebo = 48
  Emslie et al. [58]        drug = 109      20                            fixed                          9                sponsor
                            placebo = 110
  TADS [59]                 drug = 109      33.3                          flexible        10-40          12               government
                            placebo = 112

Paroxetine
  Emslie et al. [65]        drug = 101      28.3 (completers);            flexible        10-50          8                sponsor
                            placebo = 102   20.4 (overall)
  Berard et al. [22]        drug = 182      26.1 (mean maximum daily);    flexible        20-40          12               sponsor
                            placebo = 93    25.8 (endpoint LOCF)
  Keller et al. [23]        drug = 93       28 (endpoint)                 flexible        20-40          8                sponsor
                            placebo = 87

Sertraline
  Wagner et al. [39]        drug = 185      131                           flexible        50-200         10               sponsor
                            placebo = 179

LOCF = Last observation carried forward.
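
The sample size-weighted imputation of a missing standard deviation described in this section might be sketched as follows. This is a minimal illustration under one assumption: that "sample size-weighted" means a weighted mean of the donor trials' standard deviations (a pooled-variance approach would be another reasonable reading). The function name and all numbers are hypothetical and are not taken from the included trials.

```python
def weighted_sd(sds, ns):
    """Sample size-weighted mean of standard deviations from donor trials.

    sds: standard deviations reported by trials using the same measure
    ns:  corresponding sample sizes, used as weights
    """
    if not sds or len(sds) != len(ns):
        raise ValueError("sds and ns must be non-empty and equal in length")
    total_n = sum(ns)
    return sum(sd * n for sd, n in zip(sds, ns)) / total_n


# Hypothetical donor trials: SD 10.0 (n = 100) and SD 12.0 (n = 300).
imputed_sd = weighted_sd([10.0, 12.0], [100, 300])
print(imputed_sd)
```

The imputed value can then stand in for the unreported standard deviation when converting a mean difference into a standardized effect size.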

We extracted data from all self-rated measures of depression, 
which included the BDI [24]; Children's Depression Inventory 
[25]; Kutcher Adolescent Depression Scale [26]; Mood and Feelings
Questionnaire [27]; Reynolds Adolescent Depression Scale [28], and
Weinberg Screening Affective Scale [29]. We included all
measures of quality of life, functioning, autonomy, self-esteem, 
positive affect, and global mental health. Because we found very 
few measures within each aforementioned category (including no 
positive affect measures), we opted to pool these measures as a 
proxy for overall well-being. In a recent meta-analysis of atypical 
antipsychotic augmentation treatment for adult depression, mea- 
sures of quality of life and functioning were similarly pooled to 
create a composite for well-being [9]. 

Quality of life was assessed with the self-reported Pediatric 
Quality of Life Enjoyment and Satisfaction Questionnaire and the 
Sickness Impact Profile. Autonomy/functioning were assessed 
with the parent-rated Autonomous Functioning Checklist [30]. 
Self-esteem was assessed via the Self-Perception Profile [31].
Health of the Nation Outcome Scales for Children and Adolescents
(HoNOSCA) were used to assess global mental health [32].
Data were typically reported in the main study publication, but in 
three instances, data were extracted from secondary papers which 
were more focused on patient-rated outcomes than were the pri- 
mary study publications [33-35]. 

Statistical Analysis 

Effect sizes were calculated from means and standard devia- 
tions whenever possible. When these were not provided, effect siz- 
es were computed based on means and p values. Each effect size 
was weighted by its inverse variance in order to provide a pooled 
effect size estimate that most accurately approached the true pop- 
ulation effect size [36]. To correct for a small bias in Cohen's d, we 
converted effect sizes to Hedges' g [36]. Homogeneity analyses 
were performed using the Q statistic. To describe heterogeneity, 
we also calculated the I² statistic [37]. To pool estimates across
studies while incorporating potential heterogeneity, we employed
a random effects model in all analyses [38].
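
The analysis steps above (Hedges' g, inverse-variance weighting, the Q and I² statistics, and random effects pooling) can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' actual procedure: the text does not say which random effects estimator was used, so the DerSimonian-Laird estimator is assumed here; the variance formula for d is the common large-sample approximation; positive values are assumed to favor the drug group; and the function names and trial data are hypothetical.

```python
import math


def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Return (g, variance of g) for two independent groups."""
    df = n1 + n2 - 2
    sd_pooled = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    d = (m1 - m2) / sd_pooled                      # Cohen's d
    var_d = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
    j = 1 - 3 / (4 * df - 1)                       # small-sample correction
    return j * d, j ** 2 * var_d


def pool_random_effects(effects, variances):
    """Random effects pooling (DerSimonian-Laird, an assumption)."""
    k = len(effects)
    w = [1 / v for v in variances]                 # inverse-variance weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_re = [1 / (v + tau2) for v in variances]     # weights incl. tau^2
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return {"g": pooled, "se": se,
            "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
            "q": q, "i2": i2, "tau2": tau2}


# Hypothetical trials (mean improvement, SD, n for drug, then placebo).
trials = [hedges_g(12.0, 2.0, 20, 10.0, 2.0, 20),
          hedges_g(10.5, 2.0, 50, 10.0, 2.0, 50)]
result = pool_random_effects([g for g, _ in trials], [v for _, v in trials])
print(result)
```

When the between-trial variance estimate is zero, the random effects weights reduce to the plain inverse-variance weights, so the pooled estimate coincides with the fixed effect estimate.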



Results 

Supplementary figure 1 (for all online suppl. material, see
www.karger.com/doi/10.1159/000356191) shows the flow of our search,
and table 1 describes the characteristics of the included studies.
Effect sizes for each depression measure are provided in
supplementary table 1 while supplementary table 2 presents effect
sizes for nondepression measures. Table 2 shows the aggregate effect
sizes for various measure types. Clinician-rated measures of
depression yielded a small and statistically significant aggregate
effect for both adolescents (g = 0.21) and across all ages
(g = 0.25); both of these effects are of questionable clinical
significance. Some studies only reported results for the combined
age group (not separately for children and adolescents). Thus, the
combined category results are not a simple combination of the effect
sizes listed under the adolescent and child age groupings. Results
for self-rated depression measures were very small and statistically
nonsignificant across each age category. Measures of quality of
life, functioning, autonomy, and global mental health were rarely
reported. These outcomes yielded very small and statistically
nonsignificant effects for all age groups, though the effect for
adolescents nearly reached statistical (but not clinical)
significance (p = 0.07, g = 0.16).

Table 2. Pooled effect sizes and heterogeneity of effects

Domain        Age         Rater           k            Effect size (g), 95% CI  p (g)        Q            p (Q)   I², %

Depression    adolescent  clinician       [illegible]  0.21 (CI [illegible])    [illegible]  [illegible]  0.25    24.61
                          self            5            0.05 (-0.08 to 0.19)     0.44         1.18         0.88    0
              child       clinician       2            -0.07 (-0.61 to 0.47)    0.81         4.71         0.03    78.78
                          self            1            -0.05 (-0.40 to 0.30)    0.78         N/A          N/A     N/A
              combined¹   clinician       8            0.25 (0.09 to 0.40)      0.002        17.85        0.01    60.78
                          self            6            0.06 (-0.06 to 0.18)     0.36         2.15         0.83    0
Well-being²   adolescent  self or parent  3            0.16 (-0.01 to 0.33)     0.07         1.67         0.43    0
              children    self            1            0.08 (-0.24 to 0.39)     0.62         N/A          N/A     N/A
              combined    self or parent  3            0.11 (-0.03 to 0.26)     0.13         0.37         0.83    0

¹ Some studies only reported results for the combined age group (not
separately for children and adolescents). Thus, the combined category
results are not a simple combination of the effect sizes under the
adolescent and child age groupings.

² Well-being was defined as a composite of the following outcomes:
quality of life, functioning, autonomy, global mental health, and
self-esteem. N/A = Not applicable because only one trial was included
in the analysis.

The distributions of effect sizes showed no signal of
heterogeneity for any of the self- or parent-report measures.
Clinician-rated depression measures showed no substantial
heterogeneity for adolescents (p = 0.25, I² = 24.61%),
statistically significant and high heterogeneity in the two child
trials (p = 0.03, I² = 78.78%), and statistically significant
moderate heterogeneity in the combined age groups (p = 0.01,
I² = 60.78%).
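
As a worked check on the heterogeneity figures, I² can be recomputed from the Q statistics and trial counts reported in table 2, assuming the usual definition I² = max(0, (Q - df)/Q) × 100 with df = k - 1. The function name is illustrative, and because the inputs are rounded to two decimals, the results agree with the reported values only to within rounding.

```python
def i_squared(q, k):
    """I² (in percent) from Cochran's Q and the number of trials k."""
    df = k - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


# Q and k as reported in table 2.
combined_clinician = i_squared(17.85, 8)   # reported I² = 60.78%
child_clinician = i_squared(4.71, 2)       # reported I² = 78.78%
combined_self = i_squared(2.15, 6)         # reported I² = 0%

print(round(combined_clinician, 2), round(child_clinician, 2), combined_self)
```

Whenever Q falls below its degrees of freedom, as for the self-report analyses, the formula is truncated at zero, which is why those rows show I² = 0.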

As selective inclusion of trials may have impacted our 
ability to detect treatment effects on self-report measures, 
we undertook additional analyses. First, we examined the
overall effect size on clinician-rated measures in studies
which also included self-reported depression measures
for the same age group; thus, two trials without depression
self-report data were excluded [23, 39]. The aggregate
effect size across these six trials on clinician reports
was small and statistically significant: g = 0.28 (95% CI: 
0.08-0.48). On depression self-reports in the same stud- 
ies, the overall effect size was very small and not statisti- 
cally significant (g = 0.06, 95% CI: -0.06 to 0.18, p = 0.36). 
Thus, the trials which used self-reports of depression se- 
verity were not a subset that found poorer efficacy than 
the typical trial. 
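
The contrast between the two pooled estimates in this sensitivity analysis can be examined by backing out an approximate standard error and two-sided p value from each reported 95% confidence interval. This is a rough sketch under a normal approximation; because it works from rounded point estimates and interval bounds, it will not exactly reproduce the p values reported in the paper. The function name is hypothetical.

```python
import math


def z_and_p_from_ci(g, lower, upper, z_crit=1.96):
    """Approximate SE, z, and two-sided p from an effect size and 95% CI."""
    se = (upper - lower) / (2 * z_crit)
    z = g / se
    p = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal p value
    return se, z, p


# Estimates reported for the six trials with both kinds of measure.
clinician = z_and_p_from_ci(0.28, 0.08, 0.48)
self_report = z_and_p_from_ci(0.06, -0.06, 0.18)

for label, (se, z, p) in (("clinician", clinician), ("self-report", self_report)):
    print(f"{label}: SE = {se:.3f}, z = {z:.2f}, p = {p:.3f}")
```

Under this approximation the clinician-rated estimate is distinguishable from zero at the conventional 0.05 level while the self-report estimate is not, in line with the pattern described in the text.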

Discussion 

We found no evidence that antidepressants offer any 
sort of clinically meaningful benefit for youth on self-re- 
port measures of depression, quality of life, global mental 
health, or parent reports of autonomy. The debate around 
antidepressant efficacy among youth has nearly entirely 
involved differing opinions of how to interpret small but 
statistically significant effect sizes on clinician-rated
measures of depression severity. The present meta-analysis
adds to this discussion by summarizing the results of
several patient-rated and parent-rated outcome measures
across relevant trials. 

Our study has several limitations, the most obvious 
being the small sample of included trials. A larger sample 
of relevant trials may lead to differing conclusions. How- 
ever, even the strongest signal of efficacy in our results 
(a pooled statistically nonsignificant effect of g = 0.16
across measures of autonomous functioning, self-esteem,
global mental health, and quality of life among adoles- 
cents) provides little reason to suspect any robust treat- 
ment effects. Due to their lack of use in the included tri- 
als, we did not include measures of positive affect. As 
positive affect is generally conceptualized as an impor- 
tant component of subjective well-being, this is an im- 
portant limitation of the literature in this area and of our 
review [40]. 

The revised CDRS was the most frequently used clini- 
cian-rated measure of depression severity in our analysis 
and is considered the 'gold standard' in the field. The re- 
vised CDRS appears to possess strong validity [41]. How- 
ever, there is no firm empirical basis for placing undue 
value on this measure as opposed to other outcomes ex- 
amined in our analysis. Indeed, in adult depression re- 
search, it has been argued that measures outside of clini- 
cian-rated depression scales add important information 
in understanding overall well-being [7, 9-11]. 

The self-report measures included in our analysis ap- 
pear to have solid psychometric properties. The BDI [42- 
44], Children's Depression Inventory [45], and Reynolds 
Adolescent Depression Scale [46, 47] all have evidence of 
psychometric validity. The Mood and Feelings Question- 
naire appears to be sensitive to change [48]. Little re- 
search has been conducted regarding the psychometric 
properties of the Weinberg Screening Affective Scale. It 
was used in only one study in which it generated an effect 
size of 0.28, somewhat higher than the effect size of 0.16 
observed on the combined BDI/Children's Depression 
Inventory in the trial. Excluding the Weinberg Screening 
Affective Scale would make no substantial difference in 
our findings. The Kutcher Adolescent Depression Scale 
appears to yield change similar to that observed on the 
revised CDRS [23, 26].

The HoNOSCA has demonstrated sensitivity to 
change during treatment and respectable concurrent 
validity [49]. Correlations between the HoNOSCA and 
parent reports of psychopathology have been modest, but 
this does not necessarily impugn the validity of the 
HoNOSCA given that parent and child reports of psy- 
chopathology often diverge. Indeed, the well-established 
finding that various measures completed by different in- 
formants yield differing outcomes is not particularly sur- 
prising. Rather than suggesting that one measure is in- 
valid, it is more likely that each measure assesses a some- 
what separate outcome and that the totality of evidence 
across measures is more valid than the evidence provided 
by any single outcome [50]. The Pediatric Quality of Life
Enjoyment and Satisfaction Questionnaire appears to assess
a dimension of change untapped by the revised
CDRS [51]. Initial impressions of the psychometric properties
of the Pediatric Quality of Life Enjoyment and Satisfaction
Questionnaire in youth are favorable [51]. The
Sickness Impact Profile has apparently not been used fre- 
quently in youth depression trials, but when used in a 
comparative trial of fluoxetine and sertraline in adults, 
the measure was clearly sensitive to change [52]. The 
Self-Perception Profile has established reliability and va- 
lidity [53]. The parent-rated Autonomous Functioning 
Checklist has little history of use in depression; its psy- 
chometric properties for youth with depression are not 
established. Measures included in our analysis have 
mainly been investigated in terms of psychometric prop- 
erties rather than their clinical utility. Better attention to 
a fuller, richer clinimetric assessment would improve the 
validity of clinical trials in assessing meaningful out- 
comes [54, 55]. 

The small sample of studies using each of the quality 
of life, functioning, autonomy, and global mental health 
measures necessitated pooling outcomes on these instru- 
ments. It is possible that these measures assess quite dif- 
fering constructs and that treating them as a single pooled 
outcome is inappropriate. However, there was no signal 
of heterogeneity in our pooled analysis of these measures, 
suggesting that we did not combine some measures which 
showed substantial benefit with others showing no ben- 
efit. More research is clearly needed to determine the ex- 
tent to which treatment may impact these important, of- 
ten overlooked outcomes. Indeed, given the value of self- 
reported subjective outcomes, psychiatric clinical trials 
lose validity to the extent that such measures are not in- 
cluded or not clearly reported. 

Given the high emphasis on clinician-rated depression
measures in the reporting of clinical trial outcomes and 
subsequent reviews [2-4, 56], it seems that even the mod- 
est efficacy found in prior antidepressant meta-analyses 
is inflated. Perhaps this is best illustrated by fluoxetine. 
The discord between small to moderate effect sizes on 
clinician-rated measures in three trials (0.52, 0.60, and 
0.40) [57-59] and negligible to quite modest effects on 
self-reports (-0.07, 0.22, and 0.15) is notable. Further, the 
only fluoxetine trial to report quality of life and global 
mental health outcomes found no treatment benefit [35].
When considering only clinician-rated depression out- 
comes, fluoxetine is often considered the most efficacious 
antidepressant in youth [4, 60, 61]. If even the purport- 
edly most efficacious antidepressant shows essentially no 
benefit over placebo on depression self-report measures
and global mental health, then this speaks poorly of the
efficacy of antidepressants as a class in improving overall
well-being among youth. 

A widely cited meta-analysis in the Journal of the 
American Medical Association concluded that the suicid- 
ality-related risks of antidepressants were outweighed by 
the quite small and likely clinically insubstantial effect 
(d = 0.20) on clinician-rated depression measures [4]. 
Our analysis found no benefit at all on self-reported de- 
pression measures and a small, not quite statistically sig- 
nificant benefit on measures assessing overall well-being. 
It is unclear exactly how different outcome measures 
should be weighed, but our findings suggest that the overall
benefits of antidepressants in youth have been overstated
and that their overall benefit over placebo may be
vanishingly small. Further, to help clarify the risk-benefit 
ratio, an examination of risks outside of suicidality is 
needed [62]. For instance, a recent systematic review 
found a much elevated risk of excessive arousal/agitation 
among youth taking antidepressants versus placebo [63].
Data from a Food and Drug Administration systematic 
review also found that antidepressants were linked to a 
statistically significantly higher rate of hostility or agita- 
tion relative to placebo [64]. Clearly, a more expansive 
examination of the risk-benefit ratio of antidepressants in 
youth, extending beyond clinician-rated depression mea- 
sures and suicidality, is needed.